Strongly Universal String Hashing is Fast

نویسندگان

  • Daniel Lemire
  • Owen Kaser
چکیده

We present fast strongly universal string hashing families: they can process data at a rate of 0.2 CPU cycle per byte. Maybe surprisingly, we find that these families— though they requires a large buffer of random numbers—are often faster than popular hash functions with weaker theoretical guarantees. Moreover, conventional wisdom is that hash functions with fewer multiplications are faster. Yet we find that they may fail to be faster due to operation pipelining. We present experimental results on several processors including low-powered processors. Our tests include hash functions designed for processors with the Carry-Less Multiplication (CLMUL) instruction set. We also prove, using accessible proofs, the strong universality of our families.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Bucket Hashing and its Application

We introduce a new technique for constructing a family of universal hash functions. At its center is a simple metaphor: to hash a string x, cast each of its words into a small number of buckets; xor the contents of each bucket; then collect up all the buckets' contents. Used in the context of Wegman{Carter authentication, this style of hash function provides a fast approach for software message...

متن کامل

Bucket Hashing with a Small Key Size

In this paper we consider very fast evaluation of strongly universal hash functions, or equivalently, authentication codes. We show how it is possible to modify some known families of hash functions into a form such that the evaluation is similar to \bucket hashing", a technique for very fast hashing introduced by Rogaway. Rogaway's bucket hash family has a huge key size, which for common param...

متن کامل

String hashing for linear probing

Linear probing is one of the most popular implementations of dynamic hash tables storing all keys in a single array. When we get a key, we first hash it to a location. Next we probe consecutive locations until the key or an empty location is found. At STOC’07, Pagh et al. presented data sets where the standard implementation of 2-universal hashing leads to an expected number of Ω(log n) probes....

متن کامل

Performance in Practice of String Hashing Functions

String hashing is a fundamental operation, used in countless applications where fast access to distinct strings is required. In this paper we describe a class of string hashing functions and explore its performance. In particular, using experiments with both small sets of keys and a large key set from a text database, we show that it is possible to achieve performance close to that theoreticall...

متن کامل

Fast Universal Hashing with Small Keys and No Preprocessing: The PolyR Construction

We describe a universal hash-function family, PolyR, which hashes messages of effectively arbitrary lengths in 3.9–6.9 cycles/byte (cpb) on a Pentium II (achieving a collision probability in the range 2−16–2−50). Unlike most proposals, PolyR actually hashes short messages faster (per byte) than long ones. At the same time, its key is only a few bytes, the output is only a few bytes, and no “pre...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Comput. J.

دوره 57  شماره 

صفحات  -

تاریخ انتشار 2014